A Research of MapReduce with GPU Acceleration
Authors
Abstract
MapReduce is an efficient distributed computing model for large data sets. Data processing is fully distributed across a large number of nodes, and a MapReduce cluster is highly scalable. However, single-node performance is gradually becoming a bottleneck in compute-intensive jobs, which makes it difficult to extend the MapReduce model to wider application fields such as large-scale image processing and image mining. To address this, the paper presents an approach to GPU-accelerated MapReduce, implemented with Hadoop and OpenCL. As a distinctive feature, it targets general, inexpensive hardware platforms and integrates seamlessly with Apache Hadoop, the most widely used MapReduce framework. As a heterogeneous multi-machine, many-core architecture, it targets both data- and compute-intensive applications. A performance improvement of almost 2 times has been validated without any special optimization.
Similar resources
Accelerating SQL Database Operations on a GPU with CUDA: Extended Results
Prior work has shown dramatic acceleration for various database operations on GPUs, but only using primitives that are not part of conventional database languages such as SQL. This paper implements a subset of the SQLite virtual machine directly on the GPU, accelerating SQL queries by executing in parallel on GPU hardware. This dramatically reduces the effort required to achieve GPU acceleratio...
XSD: Accelerating MapReduce by Harnessing the GPU inside an SSD
Considerable research has been conducted recently on near-data processing techniques as real-world tasks increasingly involve large-scale and high-dimensional data sets. The advent of solid-state drives (SSDs) has spurred further research because of their processing capability and high internal bandwidth. However, the data processing capability of conventional SSD systems has not been impressi...
MapSQ: A MapReduce-based Framework for SPARQL Queries on GPU
In this paper, we present a MapReduce-based framework (named MapSQ) for efficiently evaluating SPARQL queries on GPU over large-scale RDF datasets by applying both high performance. First, we develop a MapReduce-based join algorithm to handle SPARQL queries in parallel. Second, we present a co-processing strategy to manage the process of evaluating queries, where the CPU is used to assign sub...
Hadoop Mapreduce OpenCL Plugin
Modern systems generate huge amounts of information in areas such as finance, telematics, healthcare, and IoT devices, to name a few, and modern computing frameworks like MapReduce need an ever-increasing amount of computing power to sort, arrange, and generate insights from the data. This project is an attempt to harness the power of heterogeneous computing, more specifically take benefit...
Hadoop+Aparapi: Making heterogenous MapReduce programming easier
Lately, programmers have started to take advantage of GPU capabilities of cloud-based machines. Using the GPUs can decrease the number of nodes required to perform the computation by increasing the productivity per node. We combine Hadoop, a widely-used MapReduce framework, with Aparapi, a new Java-to-OpenCL conversion tool from AMD. We propose an easy-to-use API which allows easy implementatio...
Publication date: 2012